Skip to content

Conversation

chriselion
Copy link
Contributor

@chriselion chriselion commented Feb 22, 2021

Proposed change(s)

Very rough WIP to convert GAIL and BC to use a new DemonstrationProvider interface.

The eventual goal (beyond the scope of this PR) is to let users define their own DemonstrationProvider interface in a plugin and use that instead of the given LocalDemonstrationProvider.

In this PR:

  • Add DemonstrationProvider interface
  • Break up the previous demo_loader() code into demonstration_proto_utils and LocalDemonstrationProvider impl
  • Convert GAIL and BC to use DemonstrationProvider.

TODO

  • Fix tets
  • Decide what fields should actually be on DemonstrationExample/DemonstratoinTrajectory
  • Remove old demo_loader code

Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)

https://jira.unity3d.com/browse/MLA-1734

Types of change(s)

  • New feature
  • Code refactor

Checklist

  • Added tests that prove my fix is effective or that my feature works
  • Updated the changelog (if applicable)

Other comments



@timed
def load_demonstration(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was demo_loader.load_demonstration

return trajectories

@staticmethod
def _get_demo_files(path: str) -> List[str]:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from demo_loader.get_demo_files

def get_behavior_spec(self) -> BehaviorSpec:
return self._behavior_spec

def pop_trajectories(self) -> List[DemonstrationTrajectory]:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to add docstrings here. But the idea is that GAIL, etc could be converted to use pop_trajectories() directly. Then if we want DemonstrationProviders to be able to load new demonstrations on the fly, the logic can be kept in the DemonstrationProvider and the consumer doesn't need to know about it, it just gets a fresh batch of trajectories.

from mlagents.trainers.trajectory import ObsUtil


class DemonstrationExperience(NamedTuple):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are trimmed down versions of AgentExperience and Trajectory classes, based on what's currently in demo_loader.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I feel like we shouldn't duplicate the conversion code here between AgentExperience and Trajectory (esp. with the teammate observations coming in, it becomes quite a fat function - and at some point I imagine we'll have teammate demonstrations as well).

Wonder if we can have a BaseAgentExperience be the base class that is used here and in the AgentProcessor, and have the AgentExperience (PolicyAgentExperience?) inherit from it? Or some other way of composing these two.

Base automatically changed from master to main February 25, 2021 19:16
@chriselion
Copy link
Contributor Author

Shelving this for now, since research won't be able to make use of it for a while, and we're still not sure on the trajectory/experience handling.

@chriselion chriselion closed this Mar 3, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 3, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants